71 research outputs found

    Malicious Package Detection in NPM and PyPI using a Single Model of Malicious Behavior Sequence

    The open-source software (OSS) supply chain enlarges the attack surface, which makes package registries attractive targets for attacks. Recently, the package registries NPM and PyPI have been flooded with malicious packages. The effectiveness of existing malicious NPM and PyPI package detection approaches is hindered by two challenges. The first challenge is how to leverage the knowledge of malicious packages from different ecosystems in a unified way such that multi-lingual malicious package detection becomes feasible. The second challenge is how to model malicious behavior in a sequential way such that maliciousness can be precisely captured. To address the two challenges, we propose and implement Cerebro to detect malicious packages in NPM and PyPI. We curate a feature set based on a high-level abstraction of malicious behavior to enable multi-lingual knowledge fusing. We organize extracted features into a behavior sequence to model sequential malicious behavior. We fine-tune the BERT model to understand the semantics of malicious behavior. Extensive evaluation has demonstrated the effectiveness of Cerebro over the state-of-the-art as well as its practically acceptable efficiency. Cerebro has successfully detected 306 and 196 new malicious packages in PyPI and NPM, and received 385 thank-you letters from the official PyPI and NPM teams.

    Limit results on pattern entropy


    Association Between RT-Induced Changes in Lung Tissue Density and Global Lung Function

    To assess the association between RT-induced changes in computed tomography (CT)-defined lung tissue density and pulmonary function tests (PFTs).

    Correlates of loneliness in older adults in Shanghai, China: does age matter?

    Background: Loneliness is a public health concern with serious health consequences in older adults. Despite a large body of research on the correlates of loneliness, little is known about the age group differences in the correlates. Given that the older adult population is heterogeneous, this study aims to examine the correlates of loneliness in older adults in Shanghai, and to explore how the correlates vary across different age groups. Methods: We used the Shanghai Urban Neighborhood Survey (SUNS), which was conducted in 2016 and 2017. The total sample size of older adults included in the analysis was 2770. Loneliness was measured using the sum of the 6 items derived from the De Jong Gierveld Loneliness Scale. Correlates include demographic variables, health conditions, social factors, and new media use. Regression analysis was used to examine the correlates of loneliness first in the whole sample, and then in the young old (60–79 years old) and the old old (80+ years old) separately. Results: The mean loneliness score was 18.48 (SD = 5.77). The old old reported a higher level of loneliness than the young old. Variables including age, living arrangement, marital status, education, health, family functioning, volunteering, square dancing, and new media use were found to be significant in the whole sample. Most of the significant correlates observed in the young old were identical to the findings reported for the total sample, with the exception of living arrangement. Self-rated health (SRH) and family functioning were two important correlates for the old old. Conclusions: Correlates of loneliness vary for the young old and the old old. The older adults at higher risk of loneliness deserve more attention and concern. Future interventions should be tailored to the young old and the old old to better help older adults alleviate loneliness and enhance their well-being.

    Universal compression of memoryless sources over unknown alphabets

    It has long been known that the compression redundancy of independent and identically distributed (i.i.d.) strings increases to infinity as the alphabet size grows. It is also apparent that any string can be described by separately conveying its symbols and its pattern (the order in which the symbols appear). Concentrating on the latter, we show that the patterns of i.i.d. strings over all alphabets, including infinite and even unknown ones, can be compressed with diminishing redundancy, both in block and sequentially, and that the compression can be performed in linear time. To establish these results, we show that the number of patterns is the Bell number, that the number of patterns with a given number of symbols is the Stirling number of the second kind, and that the redundancy of patterns can be bounded using results of Hardy and Ramanujan on the number of integer partitions. The results also imply an asymptotically optimal solution for the Good-Turing probability-estimation problem.
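
    The counting claims in this abstract can be checked by brute force for small lengths. The sketch below is only an illustration and is not taken from the paper: the function names (`pattern`, `stirling2`, `bell`, `all_patterns`) are hypothetical, and the pattern of a string is assumed to replace each symbol with the 1-based order of its first appearance.

```python
from itertools import product


def pattern(s):
    """Replace each symbol by the 1-based order of its first appearance."""
    first_seen = {}
    out = []
    for symbol in s:
        if symbol not in first_seen:
            first_seen[symbol] = len(first_seen) + 1
        out.append(first_seen[symbol])
    return tuple(out)


def stirling2(n, k):
    """Stirling number of the second kind: S(n,k) = k*S(n-1,k) + S(n-1,k-1)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def bell(n):
    """Bell number as the sum of S(n,k) over all k."""
    return sum(stirling2(n, k) for k in range(n + 1))


def all_patterns(n):
    """Enumerate the distinct patterns of all length-n strings (brute force).

    An alphabet of size n suffices, since a length-n string has at most
    n distinct symbols.
    """
    return {pattern(s) for s in product(range(n), repeat=n)}


if __name__ == "__main__":
    print(pattern("abracadabra"))
    n = 4
    pats = all_patterns(n)
    print(len(pats), bell(n))
    for k in range(1, n + 1):
        # Patterns that use exactly k distinct symbols vs. S(n, k).
        print(k, sum(1 for p in pats if max(p) == k), stirling2(n, k))
```

    The enumeration is exponential in the string length and is meant only to make the Bell and Stirling counts concrete; it says nothing about the linear-time compression scheme the abstract describes.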

    Semi-Supervised Learning by Local Behavioral Searching Strategy

    Semi-supervised learning has attracted a significant amount of attention in pattern recognition and machine learning. Among these methods, a very popular type is the semi-supervised support vector machine. However, parameter selection for the heat kernel function during the learning process is troublesome and hinders improvement in the performance of the hypothesis. To solve this problem, a novel local behavioral searching strategy is proposed for semi-supervised learning in this paper. In detail, based on human behavioral learning theory, the support vector machine is regularized with the un-normalized graph Laplacian. After building the local distribution of the feature space, the local behavioral paradigm considers the form of the underlying probability distribution in the neighborhood of a point. Validation of the proposed method is performed with extensive experiments. Results demonstrate that, compared with traditional methods, our method can more effectively and stably enhance the learning performance.
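
    For readers unfamiliar with the two ingredients named in this abstract, the following is a minimal sketch of heat-kernel edge weights and the un-normalized graph Laplacian L = D - W. It is not the paper's method: the function names, the fully connected graph, and the weight convention exp(-||x_i - x_j||^2 / t) are all assumptions made for illustration.

```python
import math


def heat_kernel_weights(points, t):
    """Weight matrix W with W[i][j] = exp(-||x_i - x_j||^2 / t), zero diagonal.

    The division by t (rather than 2t or 4t) is an assumed convention.
    """
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / t)
    return w


def unnormalized_laplacian(w):
    """Un-normalized graph Laplacian L = D - W, with D the diagonal degree matrix."""
    n = len(w)
    return [
        [(sum(w[i]) if i == j else 0.0) - w[i][j] for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
    for t in (0.1, 1.0, 10.0):
        lap = unnormalized_laplacian(heat_kernel_weights(pts, t))
        print(t, [round(v, 4) for v in lap[0]])
```

    Running it with several values of t shows how strongly the edge weights, and hence the regularizer, depend on the kernel parameter, which is the sensitivity the abstract refers to.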

    Relational prompt-based single-module single-step model for relational triple extraction

    Relational triple extraction is a fundamental information extraction task. Existing approaches to relational triple extraction achieve considerable performance but still suffer from 1) treating the relation between entities as a meaningless label while ignoring the semantic information of the relation itself and 2) ignoring the interdependence and inseparability of the three elements of a triple. To address these problems, this paper proposes a Relational Prompt approach, on which it builds a Single-module Single-step relational triple extraction model (RPSS). In particular, the proposed relational prompt approach consists of a relational hard-prompt and a relational soft-prompt, which take into account different levels of relational semantic information, covering both token-level and feature-level relational prompt information. We then jointly encode entities and relational prompts to obtain a unified global representation. We mine deep correlations between different embeddings through an attention mechanism and then construct a triple interaction matrix. All triples can then be directly extracted by a single module in a single step. Experiments demonstrate the effectiveness of the relational prompt approach and show that relational semantics and triple integrity are essential for relation extraction. Experimental results on two benchmark datasets demonstrate that our model outperforms current state-of-the-art models.
